Unlocking the Semantics of Roget’s Thesaurus Using Formal Concept Analysis
Abstract
Roget’s Thesaurus is a semantic dictionary that is organized by concepts rather than words. It has an elaborate implicit structure that has not, in the 150 years since its inception, been made explicit. Formal Concept Analysis (FCA) is a tool that can be used by researchers for the organization, analysis and visualization of complex hidden structures. In this paper we illustrate two ways in which FCA is being used to explicate the implicit structures in Roget’s Thesaurus: implications and Type-10 chain components.
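As a rough illustration of the kind of structure FCA makes explicit, the sketch below builds a toy formal context and checks attribute implications on it. The words, the sense categories, and all function names are hypothetical and chosen only for illustration; they are not taken from Roget’s Thesaurus or from the paper's own method, and the Type-10 chain components are not modelled here.

```python
from itertools import combinations

# Hypothetical toy context: words (objects) mapped to sense categories (attributes).
CONTEXT = {
    "over": {"Height", "Superiority"},
    "above": {"Height", "Superiority"},
    "high": {"Height"},
    "past": {"Superiority", "Time"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return shared


def extent(attributes):
    """Objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    found = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = intent(subset)
            found.add((frozenset(extent(shared)), frozenset(shared)))
    return found


def implication_holds(premise, conclusion):
    """True if every object having all of `premise` also has all of `conclusion`."""
    return extent(set(premise)) <= extent(set(conclusion))


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
    print(implication_holds({"Time"}, {"Superiority"}))
    print(implication_holds({"Height"}, {"Superiority"}))
```

The brute-force enumeration is exponential in the number of objects, so it only suits very small contexts; a context the size of a real thesaurus would need a dedicated FCA algorithm.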
Similar resources
Automatic Retrieval and Clustering of Similar Words
Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of words. The similarity measure allows us to construct a thesaurus using a parsed corpus. We then present a new evaluation methodology for the automatically constructed thesaurus. The evaluation results show that the the...
Measuring Semantic Distance using Distributional Profiles of Concepts
Automatic measures of semantic distance can be classified into two kinds: (1) those, such as WordNet, that rely on the structure of manually created lexical resources and (2) those that rely only on co-occurrence statistics from large corpora. Each kind has inherent strengths and limitations. Here we present a hybrid approach that combines corpus statistics with the structure of a Roget-like th...
Concept Neighbourhoods in Knowledge Organisation Systems
This paper discusses the application of concept neighbourhoods (in the sense of Formal Concept Analysis) to Knowledge Organisation Systems. Examples are provided using Roget’s Thesaurus, WordNet and Wikipedia cat-
Homograph Disambiguation Using Formal Concept Analysis
Homographs are words with identical spellings but different origins and meanings. Natural language processing must deal with the disambiguation of homographs and the attribution of senses to them. Advances have been made using context to discriminate homographs, but the problem is still open. Disambiguating homographs is possible using formal concept analysis. This paper discusses the issues, i...
Revisiting the Potentialities of a Mechanical Thesaurus
This paper revisits the lattice-based thesaurus models which Margaret Masterman used for machine translation in the 1950’s and 60’s. Masterman’s notions are mapped onto modern, Formal Concept Analysis (FCA) terminology and three of her thesaurus algorithms are formalised with FCA methods. The impact of the historical and social situatedness of Roget’s Thesaurus on such algorithms is considered....